This document describes the port of ARC SGML 1.0 to the Apple Macintosh. This distribution is a minimal port. By that I mean that an attempt was made to keep the operation of the parser as close as possible to that on DOS. It has been ported to Think C 5 using the minimum console interface and to MPW as an MPW tool. A Commando interface was added to the MPW tool to make it slightly easier to use. Complete sources are provided for both development systems. Differences between the development systems are indicated with #ifdef C preprocessor directives. "THINK_C" is the symbol for Think C and "applec" is the symbol used for MPW.
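As a rough sketch of how the two symbols can separate environment-specific code, the fragment below selects a string according to which compiler is in use. The function name host_environment is hypothetical and does not appear in the ARC SGML sources.

```c
/* Hypothetical helper, not part of the ARC SGML sources: reports which
   development system compiled this file. */
static const char *host_environment(void)
{
#if defined(THINK_C)
    return "Think C";   /* symbol used for Think C */
#elif defined(applec)
    return "MPW";       /* symbol used for MPW */
#else
    return "other";     /* neither symbol defined, e.g. a non-Macintosh build */
#endif
}
```

The same pattern works around any block of code that differs between the two environments.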
Running ARC SGML in the Macintosh Environment.
For the purposes for which I use ARC SGML, the MPW tool environment makes for a nice working environment. However, the THINK C debugger is still a big winner over MPW SADE as an interactive debugger. Hence the implementation in both environments. I use the QUED/M editor from Paragon in the Think C environment to work with SGML documents. Its ability to work with line numbers makes the diagnostic output from ARC SGML more palatable.
Think C 5
Running VM2 as an application is quite simple. Just double click the VM2 application icon, enter the parameters in the control window and go.
For development purposes, I create a folder called ARC SGML in the Think C 5 Development folder and copy the vm.π project and the source files from the folders sgmlc and sgmlh into the "Development:ARC SGML" folder.
MPW
Running vm2 as an MPW tool is quite simple too. Type vm2 on the worksheet, press Option-Enter to bring up the Commando interface, select options, choose the files and go. The only problem with this interface is that it is not possible to enter concatenated files. If you really want to do this, you must enter the full command on the worksheet.
Changes to ARC SGML common to both Macintosh environments.
Several changes were made to ARC SGML to get it to work in the Macintosh environment. A file called Compares is provided in the documentation folder. It is the output from an MPW Compare script between the original DOS files as "OLD" and the Macintosh distribution files as "NEW".
1. Unsigned chars
The most pervasive change by far was to convert nearly everything to unsigned characters as the document ARCTECH.DOC recommends. I may have gone overboard and made some changes where they weren't absolutely necessary, but I don't think so. The only places where I didn't convert were in the low level input/output in mcsgmlio.c and in some of the error message and trace message printing. Otherwise, it's pretty thorough.
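The reason unsigned characters matter can be sketched as follows. When plain char is signed, character codes above 127 become negative, so using them as table indexes reads outside the table. The names lextab and classify are hypothetical and are not taken from the ARC SGML sources.

```c
#define NCHARS 256

/* Hypothetical lookup table indexed by character code. */
static unsigned char lextab[NCHARS];

/* Reading through an unsigned char pointer keeps codes 128-255 in range.
   With a signed plain char, a code such as 0xD5 would typically become a
   negative index. */
static int classify(const unsigned char *p)
{
    return lextab[*p];
}
```

A parser that must accept the full Macintosh character set needs this kind of care wherever a character is used as an index or compared numerically.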
2. Bug fixes.
Three major bugs were found in the conversion. The first was an evaluation order dependency. The second was an attempt to dereference a null pointer under certain error conditions. The third was an attempt to free already freed memory. All are fixed and comments with WMW are supplied at the appropriate spots.
3. Input/output implementation.
This routine uses C standard i/o as its base. This causes some problems because the two different C environments implement standard i/o differently. See the environment specific information for the details.
4. Option processing.
The option character has been changed from '/' to '-' on the Macintosh. This was primarily motivated by MPW but it seems to be more consistent with some other applications. Also, option processing ignores the case of option characters. Finally, the entity-id or file name can be entered in any case. The Macintosh OS does not distinguish case in file names. File name case is preserved in the trace and diagnostic files to minimize differences for MPW Compare.
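A minimal sketch of this kind of option recognition is shown below. It is an illustration only, not the code used in the port, and the function name option_letter is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: returns the option letter folded to upper case if
   arg looks like "-x...", or 0 if arg is not an option (for example, a
   file name or a DOS-style "/x"). */
static int option_letter(const char *arg)
{
    if (arg == 0 || arg[0] != '-' || arg[1] == '\0')
        return 0;
    /* Cast before toupper so that characters above 127 are handled safely. */
    return toupper((unsigned char)arg[1]);
}
```

Folding the letter to one case once, at the point of recognition, lets the rest of the option handling compare against a single case.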
5. Unsupported options.
The T (input record format) and F (path search) options of the DOS version of ARC SGML are not supported in the Macintosh version. The T option could conceivably be useful in handling SGML files produced by non-Macintosh systems, but it should probably be done with a dialog implementing input options. The F option should probably be implemented during a conversion to real Macintosh i/o, with a dialog if a file can't be found.
Changes peculiar to a single environment.
Think C 5
Standard i/o support in Think C uses the text and binary modes to fopen to toggle whether or not the ASCII CR at the end of a line (record) of a Macintosh text file gets converted to an ASCII NL character. Text files get converted, binary files do not. I chose to use text file processing in this implementation.
Printf processing in Think C is very particular about correspondence between the size of variables and their format specification. Much work was done in the tracing routines to get the traces to produce useful information. One must carefully match the type and format specification of integers and pointers to get them to print correctly.
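The kind of matching involved can be sketched as below, assuming an ANSI C library. The function name format_trace is hypothetical, and the format string is not taken from the actual trace routines.

```c
#include <stdio.h>

/* Hypothetical trace helper: formats a long count and a pointer with
   specifiers that match the argument types exactly. The caller must
   supply a buffer large enough for the result. */
static int format_trace(char *out, long count, char *p)
{
    /* %ld matches long; %p expects a void pointer, hence the cast. */
    return sprintf(out, "count=%ld ptr=%p", count, (void *)p);
}
```

Passing a long where %d is expected, or an uncast pointer where an integer is expected, is exactly the sort of mismatch that prints garbage when the sizes of int, long and pointers differ.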
Finally, Think C implements a console function to provide C language argument processing. This is used in this port. The console function, however, fails to clean out its buffer in some situations. Older versions of the console function also wrapped output lines at odd points.
MPW
Standard i/o support in MPW C does not distinguish between text and binary mode files. Instead, MPW C simply REVERSES the coding of the C '\r' and '\n' characters! The lexical analyzer in particular has many changes as a consequence of this. This "feature" also caught an error in some of the parsing code, which was fixed.
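One way to avoid depending on how a compiler maps '\r' and '\n' is to name the ASCII codes numerically. The sketch below shows the idea. It is not necessarily how the port handles the problem, and the names ASCII_NL, ASCII_CR and is_line_break are hypothetical.

```c
/* Numeric ASCII codes, independent of how the compiler maps '\r' and '\n'. */
#define ASCII_NL 0x0A
#define ASCII_CR 0x0D

/* Hypothetical helper: true if c is either line-break character. */
static int is_line_break(int c)
{
    return c == ASCII_NL || c == ASCII_CR;
}
```

Code written against the numeric values behaves the same under both compilers, whereas a test such as c == '\n' changes meaning between them.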
MPW C's comment processing can't tolerate some (maybe all?) ASCII control characters in comments. MPW C terminates very ungracefully with this error, and the MPW editor gives absolutely no help in finding the problem. Fortunately, QUED/M lets one look at these files and eventually weed the characters out. The comments in the lexical analyzer were full of these characters, which have been replaced by their octal equivalents written out as digits.
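For anyone without an editor that shows such characters, a small scanning routine can locate them. The sketch below is a hypothetical helper, not part of the distribution. It reports the first control character in a buffer, treating tab, NL and CR as acceptable.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first ASCII control
   character in buf other than tab, NL or CR, or -1 if there is none. */
static long first_control_char(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if ((c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D) || c == 0x7F)
            return (long)i;
    }
    return -1L;
}
```

Reading a source file into memory and calling this on its contents gives the offset at which to start looking.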